AITopics | output code

Towards A Generalist Code Embedding Model Based On Massive Data Synthesis

Neural Information Processing Systems | Jun-22-2026, 22:33:44 GMT

Code embedding models attract increasing attention due to the widespread popularity of retrieval-augmented generation (RAG) in software development. These models are expected to capture the rich semantic relationships inherent to code, which differ significantly from those found in text. However, existing models remain severely limited due to the scarcity of high-quality training data. In this work, we introduce CodeR (Code Retrieval), a state-of-the-art embedding model for general-purpose code retrieval. The superior performance of CodeR is built upon CodeR-Pile, a large-scale synthetic dataset constructed under the DRU (Diversity, Reliability, Usability) principle via a novel data synthesis pipeline. To optimize training effectiveness, we propose Annealing, a curriculum learning strategy that enables effective knowledge transfer across heterogeneous sources of data. We evaluate CodeR on 16 diverse code retrieval tasks, where it significantly outperforms existing baselines and exhibits strong out-of-domain generalization performance. We have publicly released our code and the well-trained model to facilitate further research in this critical area.

large language model, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Education (0.66)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness

Hung Le

Neural Information Processing Systems | Feb-16-2026, 22:47:39 GMT

INDICT: a new framework that empowers LLMs with Internal Dialogues of Critiques for both safety and helpfulness guidance.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Error Correcting Output Codes Improve Probability Estimation and Adversarial Robustness of Deep Neural Networks

Gunjan Verma, Ananthram Swami

Neural Information Processing Systems | Feb-14-2026, 04:34:29 GMT

Neural Information Processing Systems http://nips.cc/

adversarial example, arxiv preprint arxiv, probability, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland > Prince George's County > Adelphi (0.04)
North America > Canada (0.04)

Industry:

Government > Military (0.47)
Information Technology > Security & Privacy (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

41792f041a3a0774418791993cf887fe-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 13:55:03 GMT

classifier, codeword, hamming distance, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Scalable design of Error-Correcting Output Codes using Discrete Optimization with Graph Coloring

Neural Information Processing Systems | Feb-8-2026, 13:54:59 GMT

Error-correcting codes have found many applications in machine learning.

artificial intelligence, codebook, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness

Hung Le

Neural Information Processing Systems | Oct-10-2025, 11:07:51 GMT

INDICT: a new framework that empowers LLMs with Internal Dialogues of Critiques for both safety and helpfulness guidance.

arxiv preprint arxiv, indict, language model, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Error Correcting Output Codes Improve Probability Estimation and Adversarial Robustness of Deep Neural Networks

Gunjan Verma, Ananthram Swami

Neural Information Processing Systems | Oct-3-2025, 18:13:12 GMT

From a scientific perspective, the existence of adversarial examples demonstrates that machine learning models that achieve superhuman performance on benign, "naturally occurring"

adversarial example, arxiv preprint arxiv, probability, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland > Prince George's County > Adelphi (0.04)
North America > Canada (0.04)

Industry:

Government > Military (0.47)
Information Technology > Security & Privacy (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Contrastive ECOC: Learning Output Codes for Adversarial Defense

Chou, Che-Yu, Chen, Hung-Hsuan

arXiv.org Artificial Intelligence | Aug-15-2025

Although one-hot encoding is commonly used for multiclass classification, it is not always the most effective encoding mechanism. Error Correcting Output Codes (ECOC) address multiclass classification by mapping each class to a unique codeword used as a label. Traditional ECOC methods rely on manually designed or randomly generated codebooks, which are labor-intensive and may yield suboptimal, dataset-agnostic results. This paper introduces three models for automated codebook learning based on contrastive learning, allowing codebooks to be learned directly and adaptively from data. Across four datasets, our proposed models demonstrate superior robustness to adversarial attacks compared to two baselines. The source code is available at https://github.com/YuChou20/Automated-Codebook-Learning-with-Error-Correcting-Output-Code-Technique.
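To make the idea of mapping each class to a codeword concrete, here is a minimal Python sketch of a hand-written ECOC codebook with nearest-codeword decoding. The 4-class, 6-bit codebook and the names `CODEBOOK`, `hamming`, `min_distance`, and `decode` are illustrative assumptions. They are not taken from the paper, which learns its codebooks from data rather than fixing them by hand.

```python
from itertools import combinations

# Hypothetical codebook: class index -> 6-bit codeword (illustrative values only).
CODEBOOK = {
    0: (0, 0, 0, 0, 0, 0),
    1: (1, 1, 1, 0, 0, 0),
    2: (0, 0, 1, 1, 1, 0),
    3: (1, 1, 0, 1, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(codebook):
    """Smallest pairwise Hamming distance between any two codewords."""
    return min(hamming(a, b) for a, b in combinations(codebook.values(), 2))


def decode(bits, codebook=CODEBOOK):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(codebook, key=lambda label: hamming(bits, codebook[label]))


# The codeword of class 1 with its second bit flipped still decodes to class 1.
noisy = (1, 0, 1, 0, 0, 0)
print(min_distance(CODEBOOK), decode(noisy))
```

With a minimum distance of d, nearest-codeword decoding tolerates up to floor((d - 1) / 2) flipped bits, which is the usual motivation for preferring well-separated codewords over one-hot labels.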

artificial intelligence, codebook, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2508.10491

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

41792f041a3a0774418791993cf887fe-Supplemental-Conference.pdf

Neural Information Processing Systems | Aug-14-2025, 10:20:04 GMT

classifier, codeword, hamming distance, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Reviews: Error Correcting Output Codes Improve Probability Estimation and Adversarial Robustness of Deep Neural Networks

Neural Information Processing Systems | Jan-27-2025, 02:59:15 GMT

Summary: For a softmax over the logits, the region of uncertainty (prediction probability close to 0.5) is extremely small and lies near an M-1 dimensional hyperplane in the logit space. The reason is that changing the logit of one class affects the probability vector in all dimensions. The authors show that if each logit is first converted to an independent probability using the function 1/(1 + exp(-x)), and the resulting probability vector is then correlated with each codeword of an error-correcting code to decode in a soft way, the method has a large volume of uncertainty. This volume is larger when the minimum Hamming distance of the code is large, because multiple logits must be changed at the same time to cause a wrong decoding.
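The decoding described in this summary can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the authors' exact rule: the codebook values, the mapping of bits and probabilities to the range [-1, 1], the clipping of negative correlations to zero, and the name `soft_decode` are all choices made here for the example.

```python
import math

# Hypothetical 4-class codebook with 6-bit codewords (illustrative values only).
CODEBOOK = [
    (0, 0, 0, 0, 0, 0),
    (1, 1, 1, 0, 0, 0),
    (0, 0, 1, 1, 1, 0),
    (1, 1, 0, 1, 1, 0),
]


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)), applied to one logit at a time."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_decode(logits, codebook=CODEBOOK):
    """Turn per-bit logits into class probabilities by codeword correlation."""
    # Each logit becomes an independent probability, rescaled to [-1, 1].
    signed = [2.0 * sigmoid(z) - 1.0 for z in logits]
    scores = []
    for codeword in codebook:
        # Correlate with the codeword, with bits mapped from {0, 1} to {-1, +1}.
        corr = sum(s * (2 * bit - 1) for s, bit in zip(signed, codeword))
        scores.append(max(corr, 0.0))  # assumption: clip negative correlations
    total = sum(scores)
    if total == 0.0:
        # No codeword is positively correlated: report maximal uncertainty.
        return [1.0 / len(codebook)] * len(codebook)
    return [s / total for s in scores]


# Logits matching class 1, except that the second bit has been pushed the wrong way.
probs = soft_decode([4.0, -4.0, 4.0, -4.0, -4.0, -4.0])
print(max(range(len(probs)), key=probs.__getitem__))
```

Because each bit is scored independently, perturbing a single logit lowers the confidence in the correct class without changing the decoded label. With this codebook, whose minimum Hamming distance is 3, at least two bits have to move before a different codeword wins.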

adversarially, deep neural network, probability estimation and adversarial robustness, (8 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.38)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)
